Spatial representation of temporal information through spike timing dependent plasticity

We suggest a mechanism based on spike time dependent plasticity (STDP) of synapses to store, 
retrieve and predict temporal sequences. The mechanism is demonstrated in a model system of 
simplified integrate-and-fire type neurons densely connected by STDP synapses. All synapses are 
modified according to the so-called normal STDP rule observed in various real biological synapses. 
After conditioning through repeated input of a limited number of temporal sequences the system 
is able to complete a temporal sequence upon receiving a fraction of it as input. This is 
an example of effective unsupervised learning in a biologically realistic system. We investigate the 
dependence of learning success on entrainment time, system size and presence of noise. Possible 
applications include learning of motor sequences, recognition and prediction of temporal sensory 
information in the visual as well as the auditory system and late processing in the olfactory system 
of insects. 



I. INTRODUCTION 

Animals are challenged in various ways to learn, produce, reproduce and predict temporal
patterns. Prominent examples are the numerous motor programs necessary to interact
efficiently with the environment. One specific manifestation is the vocal motor system of
songbirds. It has been shown that the temporal sequence of syllables in a bird's song
corresponds to temporal sequences of bursts in the neurons of the forebrain control
system [1, 2, 3]. These are learned and stored by the adolescent bird.

Temporal codes seem to be used for a variety of other tasks as well. Temporal coding in
the retina [4] is an example, as is information transport in the olfactory system of the
locust. In the latter it has been shown that the purely identity coded information of the
receptor neurons is transformed into an identity-temporal code inside the antennal
lobe [5, 6, 7].

Whereas there is a long history of research on sequence learning and recognition in the
framework of abstract neural networks (cf. the relevant chapters in [8] and [9] and
references therein), it is an open question how the learning and memory of time sequences
is accomplished in real biological neural systems. Three main principles for representing
time in neural systems are frequently discussed:

• The first makes use of delays and filters. There are various ways of processing
temporal information in the dendritic tree [?] or through axonal delays [?]. Other
examples are multilayer neural networks, in which the delay of the synaptic connections
between layers allows temporal information to be represented or decoded, and propagating
waves as known from the thalamocortical system [21, 22].

• The second principle rests on feedback. Through 
delayed feedback temporal information can be pro- 
cessed on the level of individual neurons as well 
as on the level of larger structures. A prominent
example of this are recurrent multilayer neural
networks, which play a role in sequence memory in
the hippocampus [23, 24].

• The third principle is to transform the temporal in- 
formation into spatial information. This can occur 
through the dynamics of a network with asymmetric
lateral inhibition [25].

In this paper we demonstrate an alternative mechanism which maps the temporal information
to the strength of synapses in a network through spike timing dependent plasticity (STDP).
Similar mechanisms have been suggested for predictive activity and direction selectivity
in the visual system [26] and learning in the hippocampus [?] as well as prediction in
hippocampal place fields and route learning in rats [?]. In contrast to these earlier
works we focus on the learning of several distinct input sequences in one system and on a
sparse coding scheme. This learning capability is necessary in order to process the
identity-temporal code believed to be generated by winnerless competition in sensory
systems [?].

Synaptic plasticity in the connections among neurons allows networks to alter the details
of their interaction and develop memories of previous input signals. The details of the
methods by which biological neurons express plasticity at synapses are not fully
understood at the biophysical level, but many aspects of the phenomena which occur when
presynaptic and postsynaptic neurons are jointly activated are now becoming clear. First
of all, it seems well established that activity at both the presynaptic and the
postsynaptic parts of a neural junction is required for the synaptic strength to change.
Arrival of a presynaptic action potential will induce, through normal neurotransmitter
release and reception by postsynaptic receptors, a postsynaptic electrical action which
generally leads to no change in the coupling strength at that synapse. Depolarization of
the postsynaptic cell by various means coupled with arrival of a presynaptic action
potential can lead to changes in synaptic strength in a variety of experimental protocols.
It is important that changes in the synaptic strength, which we denote in terms of a
conductivity change $\Delta g$, can be either positive, called potentiation, or negative,
called depression. When the expression of $\Delta g$ is long lasting, several hours or
even much longer after induction, increases in $g$ are called long term potentiation or
LTP, and decreases in $g$ are called long term depression or LTD. Good reviews of the
current situation are found in [32, 33, 34].

LTP and LTD can be induced by (1) depolarizing the postsynaptic cell to a fixed membrane
voltage and presenting presynaptic spiking activity at various frequencies, by (2)
inducing slow (LTD) or rapid (LTP) release of Ca$^{2+}$ [35], or by (3) activating the
presynaptic terminal a few tens of milliseconds before activating the postsynaptic cell,
leading to LTP, or presenting the activation in the other order, leading to LTD [36, 37].

In this paper we study numerically a network com- 
posed of integrate-and-fire neurons which are densely 
coupled with synaptic interactions whose maximal con- 
ductances are permitted to change in accordance with the 
observations on closely spaced spike arrival times to the 
presynaptic and postsynaptic junctions of the synapse. 

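The response of a learning synapse to the arrival of a presynaptic spike at
$t_{\mathrm{pre}}$ and a postsynaptic spike at $t_{\mathrm{post}}$ is a function only of
$\Delta t = t_{\mathrm{post}} - t_{\mathrm{pre}}$. For $\Delta t > 0$, $\Delta g(\Delta t)$
is positive (LTP), and for $\Delta t < 0$, $\Delta g(\Delta t)$ is negative (LTD):

$$
\begin{aligned}
\Delta g(\Delta t) &= A_+ \frac{\Delta t}{\tau_+}\, e^{-\Delta t/\tau_+} && \text{for } \Delta t > 0, \\
\Delta g(\Delta t) &= A_- \frac{\Delta t}{\tau_-}\, e^{\Delta t/\tau_-} && \text{for } \Delta t < 0,
\end{aligned}
\tag{1}
$$

where $A_+$, $A_-$, $\tau_+$ and $\tau_-$ are positive constants (see Fig. 1).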

Synaptic plasticity of this type is often referred to as spike timing dependent plasticity
(STDP). For many mammalian in vitro or cultured preparations the characteristic LTD time
$\tau_-$ is about two or three times longer than the characteristic LTP time $\tau_+$.

FIG. 1: STDP learning rule. $\Delta g = A_+ \frac{\Delta t}{\tau_+} e^{-\Delta t/\tau_+}$
for $\Delta t > 0$ and $\Delta g = A_- \frac{\Delta t}{\tau_-} e^{\Delta t/\tau_-}$ for
$\Delta t < 0$, with $A_+, A_- > 0$. This form of the learning rule was directly inferred
from experimental data [37].
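As an illustration of the learning rule (1), the following minimal Python sketch evaluates the STDP window. It is not the authors' C++ implementation. The parameter values are the ones quoted in Sec. II B; the function name `stdp_delta_g` and the choice to return zero at exactly $\Delta t = 0$ are assumptions made for this example.

```python
import math

# Parameter values quoted in Sec. II B (conductances in microsiemens, times in ms).
A_PLUS = 0.3
A_MINUS = 2.0 / 3.0 * A_PLUS
TAU_PLUS = 16.0
TAU_MINUS = 1.5 * TAU_PLUS


def stdp_delta_g(dt_ms):
    """Change of the raw synaptic strength for dt = t_post - t_pre, Eq. (1)."""
    if dt_ms > 0:
        # Presynaptic spike first: potentiation (LTP).
        return A_PLUS * (dt_ms / TAU_PLUS) * math.exp(-dt_ms / TAU_PLUS)
    if dt_ms < 0:
        # Postsynaptic spike first: depression (LTD); dt_ms < 0 makes this negative.
        return A_MINUS * (dt_ms / TAU_MINUS) * math.exp(dt_ms / TAU_MINUS)
    # Assumption: Eq. (1) does not define the case dt = 0, so no change is applied.
    return 0.0
```

With this form the potentiation is largest at $\Delta t = \tau_+$ and the depression is strongest at $\Delta t = -\tau_-$, which is why the window width limits the range of usable input speeds.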

Here we inquire how a network composed of familiar integrate-and-fire neurons can develop
preferred spatial patterns of connectivity when interacting through synapses which update
their strength according to the STDP learning rule just given. This rule is a
simplification, which applies to our setting of spiking neurons, of more general models
[38, 39, 40, 41, 42, 43, 44] which indicate how $\Delta g(\Delta t)$ behaves under
stimulus of arbitrary presynaptic and postsynaptic waveforms.

The transformation of temporal information into 
synapse strength through STDP maps a temporal se- 
quence of excitations of neurons to a chain of stronger or 
weaker synapses among these neurons. If the synapses 
are excitatory, a strengthened chain of synapses facili- 
tates subsequent excitations of the same temporal pat- 
tern up to a point that activation of a few neurons from 
the temporal sequence allows the system to complete the 
remaining sequence. The temporal sequence thus has 
been learned by the system. We demonstrate this type of 
sequence learning in a computer simulation of a system 
with integrate-and-fire neurons and Rall-type synapses 
and investigate the reliability of learning, the storage ca- 
pacity in terms of the number of stored sequences, the 
scaling of both with system size and sequence length and 
the robustness against different types of noise. 



II. MODEL SYSTEM 
A. Components and Connections 

To explore the learning principle we simulated a network with the topology shown in
Fig. 2. In this network $n$ integrate-and-fire neurons are connected all-to-all while
each neuron also receives input from one "input neuron" (filled ovals in Fig. 2).

The "input neurons" generate rectangular spikes of 3 ms duration at times determined by
externally chosen input sequences. Each of these spikes is sufficient to trigger exactly
one spike in the receiving neuron (see Fig. 3). The input sequences are chosen such that
only one "input neuron" spikes at any given time, and the time between input spikes is
fixed in the normal test setup. In a later section these input neurons are replaced by
Poisson neurons with random spike times.

FIG. 2: Morphology of the model system. The ovals are artificial input neurons producing
rectangular spikes of 3 ms duration at specified times. Each is connected by a non-plastic
excitatory synapse to one of the neurons in the main "cortex" (dotted lines). The full
circles depict the integrate-and-fire neurons of the main "cortex". They are connected
all-to-all by STDP synapses shown as solid gray lines. The big full circle on the right
depicts a neuron with slow calcium dynamics which inhibits all neurons in the "cortex"
through the non-plastic synapses shown as dashed lines.

FIG. 3: Typical piece of a training session. The rectangular spikes in the upper panel are
the input signal, spaced by 10 ms in this example. The traces in the middle panel are the
integrate-and-fire memory neurons. The slow spike train in the bottom panel belongs to the
globally inhibitory neuron. Note the instantaneous onset of the spikes in the
integrate-and-fire neurons and how the inhibitory neuron segments the input into pieces of
6 spikes each.

The membrane voltage of the integrate-and-fire neurons used in this study for
sub-threshold activity is given by

$$
C \frac{dV(t)}{dt} = -g_{\mathrm{Leak}} \left( V(t) - V_{\mathrm{Leak}} \right) + I_{\mathrm{Synapse}}(t) \tag{2}
$$

where $C = 0.2$ nF, $g_{\mathrm{Leak}} = 0.3$ μS and $V_{\mathrm{Leak}} = -60$ mV.
Whenever the membrane potential $V(t)$ reaches $V_{\mathrm{th}} = -40$ mV, it is set to
the firing voltage $V_{\max} = 50$ mV, kept at that voltage for $t_{\mathrm{fire}} = 2$ ms
and then released into the normal integration state. The neuron is subsequently refractory
for $t_{\mathrm{refract}} = 40$ ms before another firing event is allowed. During the
refractory period the neuron integrates normally, but crossing the firing threshold has
no effect. In the implementation of integrate-and-fire neurons used in this work no
crossing of the firing threshold from below is necessary to elicit a spike in a
super-threshold neuron after the refractory period. See Fig. 3, middle panel, for typical
spike forms.
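The firing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a fixed-step forward Euler scheme instead of the variable-step Runge-Kutta integrator of the original simulation, it counts the refractory period from the end of the 2 ms plateau, and the names `simulate_neuron` and `i_syn` are hypothetical.

```python
# Memory-neuron parameters from the text (nF, microsiemens, mV, ms; currents in nA).
C = 0.2
G_LEAK = 0.3
V_LEAK = -60.0
V_TH = -40.0
V_MAX = 50.0
T_FIRE = 2.0
T_REFRACT = 40.0


def simulate_neuron(i_syn, dt=0.01, t_end=100.0):
    """Integrate Eq. (2) with forward Euler; i_syn(t) is the input current in nA.

    Returns the list of spike onset times in ms.
    """
    v = V_LEAK
    fire_until = -1.0      # end of the current rectangular spike plateau
    refract_until = -1.0   # assumption: refractoriness starts when the plateau ends
    spikes = []
    for step in range(int(round(t_end / dt))):
        t = step * dt
        if t < fire_until:
            v = V_MAX  # hold the plateau for T_FIRE
            continue
        if v >= V_TH and t >= refract_until:
            # No crossing from below is required: being above threshold suffices.
            spikes.append(t)
            v = V_MAX
            fire_until = t + T_FIRE
            refract_until = fire_until + T_REFRACT
            continue
        # Normal integration, also during the refractory period (non-resetting neuron).
        v += dt * (-G_LEAK * (v - V_LEAK) + i_syn(t)) / C
    return spikes
```

With a constant super-threshold current the neuron in this sketch fires once per $t_{\mathrm{fire}} + t_{\mathrm{refract}}$, because it is still above threshold when the refractory period ends.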

A neuron connected to all neurons in the network (large filled circle in Fig. 2) provides
global inhibition whenever the activity in the network exceeds a certain threshold. The
inhibitory neuron is an integrate-and-fire neuron governed by (2) with $C = 1.0$ nF,
$g_{\mathrm{Leak}} = 0.01$ μS, $V_{\mathrm{Leak}} = -60$ mV, $V_{\mathrm{th}} = -40$ mV,
$V_{\max} = 50$ mV and $t_{\mathrm{fire}} = 5$ ms. In contrast to the memory neurons this
neuron is reset to its resting potential $V_{\mathrm{Leak}}$ after each firing. Then the
membrane potential is fixed to $V_{\mathrm{Leak}}$ for $t_{\mathrm{refract}} = 10$ ms
until normal integration resumes. The inhibitory neuron was implemented as a resetting
integrate-and-fire neuron because it has a very weak leak current allowing integration
over long time windows. This weak leak current would cause very unnatural broad spikes in
a non-resetting neuron. A typical voltage trace is shown in the lowest panel of Fig. 3.

Our model of the synapses comes from Rall [45, 46] and is now a standard model for
simplified synaptic dynamics [47]. In particular, we use

$$
I_{\mathrm{Synapse}}(t) = -g_{\mathrm{syn}}\, g(t) \left( V_{\mathrm{post}}(t) - V_{\mathrm{syn}} \right), \tag{3}
$$

where $g(t)$ satisfies

$$
\begin{aligned}
\frac{df(t)}{dt} &= \frac{1}{\tau_{\mathrm{syn}}} \left( \Theta\!\left( V_{\mathrm{pre}}(t) - V_{\mathrm{th}} \right) - f(t) \right), \\
\frac{dg(t)}{dt} &= \frac{1}{\tau_{\mathrm{syn}}} \left( f(t) - g(t) \right),
\end{aligned}
\tag{4}
$$

and $V_{\mathrm{syn}} = 0$ mV, $V_{\mathrm{th}} = 20$ mV, $\tau_{\mathrm{syn}} = 15$ ms.
$V_{\mathrm{pre}}(t)$ and $V_{\mathrm{post}}(t)$ are the pre- and postsynaptic membrane
potentials and $g_{\mathrm{syn}}$ is the strength of the synapse. $\Theta(u) = 0$ for
$u < 0$ and $\Theta(u) = 1$ for $u > 0$ is the usual Heaviside function. Typical EPSPs
generated by these synapses can be seen in the middle panel of Fig. 3.
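To make the two-stage filter of Eq. (4) concrete, the sketch below integrates it for an arbitrary presynaptic voltage trace. It is a simplified illustration, not the original code: forward Euler replaces the Runge-Kutta scheme, the names `synapse_trace` and `synaptic_current` are hypothetical, and the synaptic threshold is taken as printed in the source (for rectangular spikes from below $-40$ mV to $50$ mV any threshold between these values gives the same trace).

```python
TAU_SYN = 15.0    # ms
V_SYN = 0.0       # mV, reversal potential of the excitatory synapse
V_TH_SYN = 20.0   # mV, presynaptic threshold as printed in the source


def synapse_trace(v_pre, dt=0.01, t_end=150.0):
    """Integrate Eq. (4) with forward Euler; v_pre(t) is the presynaptic voltage in mV.

    Returns a list of (t, g) pairs for the dimensionless activation g(t).
    """
    f = 0.0
    g = 0.0
    trace = []
    for step in range(int(round(t_end / dt))):
        t = step * dt
        theta = 1.0 if v_pre(t) > V_TH_SYN else 0.0  # Heaviside function
        f_next = f + dt * (theta - f) / TAU_SYN
        g_next = g + dt * (f - g) / TAU_SYN
        f, g = f_next, g_next
        trace.append((t, g))
    return trace


def synaptic_current(g_syn, g, v_post):
    """Synaptic current of Eq. (3) in nA for g_syn in microsiemens and v_post in mV."""
    return -g_syn * g * (v_post - V_SYN)
```

For a 2 ms rectangular presynaptic spike the activation $g(t)$ in this sketch keeps rising after the spike has ended and peaks roughly one synaptic time constant later, which is the smooth EPSP shape referred to in the text.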

The synaptic strength of the internal synapses is adjusted according to the synaptic
plasticity rule shown in Fig. 1 whenever a spike occurs in their presynaptic and
postsynaptic neurons. In itself, this rule may lead to "run-away" behavior of the
synaptic strengths. While this may be avoided in the dynamical model of synaptic
plasticity [38], we need to address this within the simpler model used here. We do so by
two approaches: (1) We add a long term, slow decay to the synaptic plasticity which
would, all other factors being absent, bring it back to a nominal allowed level a long
time after alteration by our rule. This we implement with

$$
\frac{dg_{\mathrm{raw}}(t)}{dt} = -\frac{1}{\tau_g} \left( g_{\mathrm{raw}}(t) - g_{0,\mathrm{raw}} \right), \tag{5}
$$

where $g_{0,\mathrm{raw}}$ is the initial value of the unmodified synapse strength. So
after potentiation or depression according to the synaptic plasticity rule, the synaptic
strength is allowed to slowly decay back to its original value. The time scale of this
exponential decay is set by $\tau_g = 200$ s. (2) $g_{\mathrm{raw}}$ is an intermediate
variable which is then translated into the synaptic strength $g_{\mathrm{syn}}$ via a
sigmoid saturation rule

$$
g_{\mathrm{syn}} = \frac{g_{\max}}{2} \left( \tanh\!\left( g_{\mathrm{slope}} \left( g_{\mathrm{raw}} - g_{1/2} \right) \right) + 1 \right), \tag{6}
$$

where $g_{\max}$ is the largest allowed value for the synaptic conductivity, and $g_{1/2}$
sets the threshold where saturation to this value is implemented. All data shown in this
work were obtained with $g_{\max} = 2.8$ μS, $g_{1/2} = \frac{1}{2} g_{\max}$ and
$g_{\mathrm{slope}} = 1/g_{1/2}$. In addition the globally inhibitory neuron tends to
curb the tendency of the network to saturate its synaptic strengths.
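The two bounding mechanisms can be written down compactly. The following Python sketch is an illustration only: it uses the closed-form solution of the decay equation (5) between plasticity events rather than numerical integration, and the function names `decay_raw` and `g_syn_from_raw` are hypothetical.

```python
import math

G_MAX = 2.8              # microsiemens
G_HALF = 0.5 * G_MAX     # g_{1/2}
G_SLOPE = 1.0 / G_HALF   # g_slope
TAU_G_MS = 200_000.0     # tau_g = 200 s expressed in ms


def decay_raw(g_raw, g0_raw, elapsed_ms):
    """Exact solution of Eq. (5) over an interval without plasticity events."""
    return g0_raw + (g_raw - g0_raw) * math.exp(-elapsed_ms / TAU_G_MS)


def g_syn_from_raw(g_raw):
    """Sigmoid saturation of Eq. (6): maps the raw variable into [0, G_MAX]."""
    return 0.5 * G_MAX * (math.tanh(G_SLOPE * (g_raw - G_HALF)) + 1.0)
```

In this form the raw variable can grow or shrink without bound under repeated STDP updates, while the conductance that actually enters Eq. (3) stays between zero and $g_{\max}$.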

These features of our model reflect our lack of knowl- 
edge of the biophysical factors setting the synaptic 
strength in the first place and our equivalent lack of 
knowledge how these factors bound the eventual rise 
or fall of synaptic strength. Our assumption in using 
these rules is that the actual mechanisms, while surely 
more complicated in detail, will provide the same effec- 
tive bounding feature. 

The complete system is realized in C++ using an order 5/6 variable time step Runge-Kutta
algorithm [48]. The error goal per time step was $10^{-7}$ in all simulations. A run of
100 simulated seconds of a system with 50 neurons takes about 3 hours on an Athlon
1.4 GHz processor.

This model system mimics the situation of a highly 
connected piece of cortex receiving input from the neu- 
ral periphery. Our input can be interpreted in two ways. 
It might be a single strong excitatory postsynaptic po- 
tential (EPSP) received from an upstream neuron which 
is strong enough to trigger a spike. It could also be in- 
terpreted as the coincidence of several weaker EPSPs re- 
ceived from various presynaptic neurons being sufficient 
to cause a spike. 



B. Operations and Activity 

To test the ability of this network to store (learn) and 
retrieve (remember) temporal-identity patterns it was 
trained with sets of randomly chosen sequences of in- 
puts. These sequences were chosen without repetition of 
neurons within the sequence. Note that this implies a 
minimal time of the order of the length of the sequence 
between spikes in each neuron. For this reason the choice 
of resetting or non-resetting neurons is not important 
as the integration times of the neurons are small com- 
pared to the total length of the sequences and the time 
scale of the global inhibition. Our choice of non-resetting 
integrate-and-fire neurons was mainly guided by the more 
natural spike form of the non-resetting variety. 

The sequences were presented continuously, with the first neuron of the sequence following
the last with the same time delay as the neurons within the sequence. The global
inhibition of the system partitions this continuous input of spikes into pieces of about
6 to 8 spikes at a time. Between these input windows the whole system is inhibited and
thus reset. This mechanism can be seen in the example training session shown in Fig. 3.
Note that the details of the global inhibition mechanism do not matter as long as the
system is efficiently reset after an appropriate amount of activity.
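The training input described above can be generated as follows. This is a small sketch under assumptions: the helper names `random_sequences` and `input_schedule` are hypothetical, and Python's `random.sample` stands in for whatever random number generator the original simulation used.

```python
import random


def random_sequences(n_neurons, n_sequences, length, seed=0):
    """Draw sequences of distinct neuron indices (no neuron repeats within a sequence)."""
    rng = random.Random(seed)
    return [rng.sample(range(n_neurons), length) for _ in range(n_sequences)]


def input_schedule(sequence, isi_ms, n_spikes, t0=0.0):
    """(time, neuron) pairs for the continuous, cyclically closed presentation.

    After the last neuron of the sequence the first one follows with the same
    inter spike interval isi_ms.
    """
    return [(t0 + m * isi_ms, sequence[m % len(sequence)]) for m in range(n_spikes)]
```

For a sequence of length 8 and an inter spike interval of 10 ms each input neuron is then driven once every 80 ms, which is the minimal time between spikes mentioned in the text.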

The learning rate $A_+$ and the time scale of forgetting $\tau_g$ in the synaptic
plasticity learning rule were chosen such that learning reaches a steady state after a
learning time of about $1600\,\Delta t$, where $\Delta t$ is the fixed inter spike
interval between input activations. For an example of the learning protocol see Fig. 3.
In all studies described below $\Delta t$ was chosen as $\Delta t = 10$ ms, 15 ms or
20 ms. The learning rule has to accommodate all these input speeds and possibly values in
between. In particular we here chose $A_+ = 0.3$ μS, $A_- = \frac{2}{3} A_+$,
$\tau_+ = 16$ ms, $\tau_- = \frac{3}{2} \tau_+$ and $\tau_g = 200$ s.

After the training phase the network was presented 
with pieces of the training patterns. We presented all 
possible ordered pieces of one to four input spikes and 
recorded the number and identity of spiking neurons in 
the network in response to this input. Perfect learning of 
the patterns would correspond to obtaining a spike from 
each of the network neurons in a given pattern when pre- 
senting a piece of two or three inputs from that pattern 
to the "input neurons" . Furthermore, all other network 
neurons should remain inactive if the pattern is repro- 
duced exactly. 

As a result of incomplete or ineffective learning two 
types of errors can occur. (1) Neurons which should be 
excited within the given pattern do not spike or (2) neu- 
rons which are not supposed to spike do so. Due to over- 
lap of input patterns, the learning efficiency is a function 
of the number of learned patterns as well as the size of 
the network. Therefore, estimating the expected amount 
of overlaps in the randomly chosen input sequences pro- 
vides information about the optimally achievable system 
performance. 
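The test protocol and the two error types can be expressed as a short scoring routine. The sketch below is illustrative: the names `cue_pieces` and `score_recall` are hypothetical, and it assumes that the "ordered pieces" used as cues are contiguous stretches of the cyclically closed pattern.

```python
def cue_pieces(pattern, length):
    """All contiguous pieces of the given length, treating the pattern as cyclic.

    Assumption: cues are contiguous stretches of the trained pattern.
    """
    k = len(pattern)
    return [[pattern[(m + d) % k] for d in range(length)] for m in range(k)]


def score_recall(pattern, spiking_neurons):
    """Return (missed, erroneous) neuron sets for one recall episode.

    missed:    pattern neurons that did not spike (error type 1)
    erroneous: neurons that spiked but are not part of the pattern (error type 2)
    """
    expected = set(pattern)
    observed = set(spiking_neurons)
    return expected - observed, observed - expected
```

Perfect recall corresponds to both returned sets being empty.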



The probability distribution for the number $Y_{ijrkn}$ of ordered $j$-tuples occurring in
at least $i$ out of $r$ patterns with $k$ neurons each for a system with a total number of
$n$ neurons can be calculated in the following way. First consider a given ordered
$j$-tuple and a given pattern with $k$ neurons. The sequence is presented continuously and
therefore needs to be interpreted as cyclically closed. Thus there are $k$ possibilities
to position the $j$-tuple in the sequence (starting at neuron 1 to starting at neuron $k$)
and $(n-j)!/(n-j-(k-j))!$ possibilities to choose the remaining neurons in the sequence.
The total number of sequences of length $k$ is $n!/(n-k)!$. Therefore, the probability
$p_j$ to have a given ordered $j$-tuple in a given pattern with $k$ active neurons is
given by

$$
p_j = k\, \frac{(n-j)!}{(n-k)!} \Big/ \frac{n!}{(n-k)!} = k\, \frac{(n-j)!}{n!}. \tag{7}
$$

If $r$ sequences of length $k$ are chosen independently, the probability $P_i$ to have any
given ordered $j$-tuple of neurons in $i$ or more of the $r$ sequences is given by the
binomial distribution with parameters $r$ and $p_j$,

$$
P_i = \sum_{s=i}^{r} \binom{r}{s} (p_j)^s (1-p_j)^{r-s}. \tag{8}
$$

FIG. 4: Comparison of the expectation values for $Y_{2,2,10,8,n}$ (lower line) and
$X_{2,3,10,8,n}$ (upper line) obtained from (9) and (12) to the normalized number of
occurrences of unordered 3-tuples (gray dots) and ordered 2-tuples (black dots) in more
than 2 sequences in 100000 randomly generated sets of 10 sequences of length 8. The inset
shows a closeup of the data on ordered tuples in the region with system size around 50
neurons, which is the size used in most numerical simulations.



To a good approximation one can treat the events of one given $j$-tuple being in $i$ or
more sequences and another $j$-tuple being so as independent. In this approximation the
probability distribution for $Y_{ijrkn}$ is again a binomial distribution with parameters
$\binom{n}{j} j!$ and $P_i$,

$$
P(Y_{ijrkn} = l) = \binom{\binom{n}{j} j!}{l} (P_i)^l (1-P_i)^{\binom{n}{j} j! - l}. \tag{9}
$$

Fig. 4 shows a comparison of the expectation value $\mathrm{E}\,Y_{2,2,10,8,n}$ obtained
from this approximate distribution to the relative number of occurrences in 100000
randomly generated sets of 10 sequences of length 8. There is no significant discrepancy,
which demonstrates the precision of the estimate.
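The comparison between the analytic expectation value and randomly generated sequence sets can be reproduced in outline with the sketch below. It is an illustration, not the code behind Fig. 4: the function names are hypothetical, the default of 2000 trials is chosen only to keep the run short, and tuples are counted when they occur in at least $i$ sequences, following the definition of $Y_{ijrkn}$.

```python
import random
from collections import Counter
from math import comb, perm


def p_ordered(n, k, j):
    """Eq. (7): probability of a given ordered j-tuple in one cyclic pattern."""
    return k / perm(n, j)  # perm(n, j) = n!/(n-j)!


def tail_binomial(r, p, i):
    """Probability of i or more successes in r independent trials."""
    return sum(comb(r, s) * p**s * (1.0 - p) ** (r - s) for s in range(i, r + 1))


def expected_ordered(i, j, r, k, n):
    """Expectation value of Y_ijrkn: number of ordered j-tuples times P_i."""
    return perm(n, j) * tail_binomial(r, p_ordered(n, k, j), i)


def shared_ordered_tuples(sequences, i, j):
    """Count ordered j-tuples of consecutive neurons found in at least i sequences."""
    counts = Counter()
    for seq in sequences:
        k = len(seq)
        # The set removes duplicates, so each sequence contributes a tuple once.
        counts.update({tuple(seq[(m + d) % k] for d in range(j)) for m in range(k)})
    return sum(1 for c in counts.values() if c >= i)


def monte_carlo_ordered(i, j, r, k, n, trials=2000, seed=0):
    """Average number of shared ordered j-tuples over random sets of r sequences."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seqs = [rng.sample(range(n), k) for _ in range(r)]
        total += shared_ordered_tuples(seqs, i, j)
    return total / trials
```

Calling `expected_ordered(2, 2, 10, 8, 50)` and `monte_carlo_ordered(2, 2, 10, 8, 50)` gives two numbers that can be compared directly for the system size used in most simulations.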

The probability distribution for the number $X_{ijrkn}$ of unordered $j$-tuples occurring
in at least $i$ out of $r$ patterns with $k$ neurons each for a system with a total number
of $n$ neurons can be calculated in much the same way. Now the probability $\tilde{p}_j$
to have a given unordered $j$-tuple in a given pattern with $k$ active neurons is

$$
\tilde{p}_j = \binom{n-j}{k-j} \Big/ \binom{n}{k}. \tag{10}
$$

Then, the probability to have any given unordered $j$-tuple of neurons in $i$ or more of
$r$ independently chosen patterns is given by the binomial distribution with parameters
$r$ and $\tilde{p}_j$,

$$
\tilde{P}_i = \sum_{s=i}^{r} \binom{r}{s} (\tilde{p}_j)^s (1-\tilde{p}_j)^{r-s}. \tag{11}
$$


Again assuming independence for the occurrence of distinct tuples, this leads once more to
a binomial distribution, now with parameters $\binom{n}{j}$ and $\tilde{P}_i$,

$$
P(X_{ijrkn} = l) = \binom{\binom{n}{j}}{l} (\tilde{P}_i)^l (1-\tilde{P}_i)^{\binom{n}{j} - l}. \tag{12}
$$

The comparison of the expectation values $\mathrm{E}\,X_{2,3,10,8,n}$ with numerically
observed relative numbers of occurrence in Fig. 4 again shows a perfect match.

The model parameters were chosen such that two to three spiking predecessors of a given
neuron in a trained sequence are sufficient to excite that neuron. The learning
performance is therefore poor as long as there is a significant number of ordered 2-tuple
overlaps in the patterns. The rule of thumb $\mathrm{E}\,Y_{22rkn} < 0.5$ for the
expectation value of $Y_{ijrkn}$ provides an estimate for the number $r$ of patterns of
length $k$ that can be successfully stored in a system of $n$ neurons. Another estimate
for the number of learnable sequences is provided by the rule of thumb
$\mathrm{E}\,X_{23rkn} < 0.5$, i.e., the overlaps in input sequences should have
negligible impact on the learning if there is no significant number of unordered 3-tuples
occurring in more than one pattern.
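Both rules of thumb can be turned into a direct search for the largest admissible number of patterns. The sketch below is an illustration with hypothetical function names; it evaluates the expectation values as the number of possible tuples times the binomial tail probability, as in (7) to (12).

```python
from math import comb, perm


def tail_binomial(r, p, i):
    """Probability of i or more successes in r independent trials."""
    return sum(comb(r, s) * p**s * (1.0 - p) ** (r - s) for s in range(i, r + 1))


def expected_ordered(i, j, r, k, n):
    """E Y_ijrkn: expected number of ordered j-tuples in at least i of r patterns."""
    n_tuples = perm(n, j)  # n!/(n-j)!
    return n_tuples * tail_binomial(r, k / n_tuples, i)


def expected_unordered(i, j, r, k, n):
    """E X_ijrkn: expected number of unordered j-tuples in at least i of r patterns."""
    p = comb(n - j, k - j) / comb(n, k)
    return comb(n, j) * tail_binomial(r, p, i)


def max_patterns(expected, i, j, k, n, bound=0.5):
    """Largest r for which the expected number of overlaps stays below the bound."""
    r = 0
    while expected(i, j, r + 1, k, n) < bound:
        r += 1
    return r
```

For example, `max_patterns(expected_ordered, 2, 2, 8, 50)` applies the first rule of thumb and `max_patterns(expected_unordered, 2, 3, 8, 50)` the second, both for sequences of length 8 in a network of 50 neurons.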

Typically capacity estimates are given in the limit of the system size $n$ tending to
infinity. As shown in the Appendix, the leading term of the Taylor expansion of $P_i$ with
respect to $p_j$ around $p_j = 0$ is

$$
P_i = \binom{r}{i} (p_j)^i + O\!\left( (p_j)^{i+1} \right), \tag{13}
$$

FIG. 5: Estimate for the maximum storage capacity of the system (horizontal axis: system
size $n$). The dashed lines divide the plane into two regions with
$\mathrm{E}\,Y_{2,2,r,k,n} > 0.5$ above and $\mathrm{E}\,Y_{2,2,r,k,n} < 0.5$ below for
$k = 8$ (topmost line), 10 (middle line), and 12 (lowest line), respectively. The thin
solid lines are the corresponding estimates for the asymptotically correct values
$r(n, k, \frac{1}{2})$. The dash-dotted lines analogously mark the boundaries between
regions with $\mathrm{E}\,X_{2,3,r,k,n} > 0.5$ above and $\mathrm{E}\,X_{2,3,r,k,n} < 0.5$
below. Again the thin lines are the asymptotically correct estimates
$\tilde{r}(n, k, \frac{1}{2})$.

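such that the asymptotic equation

$$
\lim_{n \to \infty} \mathrm{E}\,Y_{ijrkn} = \epsilon \tag{14}
$$

leads to

$$
\lim_{n \to \infty} \frac{n!}{(n-j)!} \binom{r}{i} \left( k\, \frac{(n-j)!}{n!} \right)^{i} = \epsilon \tag{15}
$$

$$
\Rightarrow \quad \lim_{n \to \infty} \frac{r^i k^i}{i!}\, n^{-j(i-1)} = \epsilon \tag{16}
$$

such that the capacity $r(n,k,\epsilon)$ is asymptotically

$$
r(n,k,\epsilon) = \frac{1}{k} (i!\,\epsilon)^{1/i}\, n^{j(i-1)/i}. \tag{17}
$$

In the same way

$$
\lim_{n \to \infty} \mathrm{E}\,X_{ijrkn} = \epsilon \tag{18}
$$

leads to

$$
\tilde{r}(n,k,\epsilon) = \frac{(k-j)!}{k!} (i!\,j!\,\epsilon)^{1/i}\, n^{j(i-1)/i}. \tag{19}
$$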

FIG. 6: Simple example of a learned identity-temporal pattern. The neurons at the corners
of the octagon have been repeatedly excited in clockwise order. The width and grayscale of
the connections encode the strength of the corresponding synapse and the small circle at
the end shows its direction. As one can clearly see, the temporal pattern is transformed
into an ordered spatial pattern by synaptic plasticity.

The dashed lines in Fig. 5 are some examples for the first rule of thumb
$\mathrm{E}\,Y_{22rkn} = \frac{1}{2}$ and the thin solid lines are the corresponding
values of $r(n,k,\frac{1}{2})$. The estimates based on the rule
$\mathrm{E}\,X_{23rkn} = \frac{1}{2}$ are shown as dash-dotted lines in Fig. 5 and the
corresponding values of the asymptotically correct $\tilde{r}(n,k,\frac{1}{2})$ are again
shown as thin solid lines. The correspondence between the exact evaluation of the capacity
estimators and the asymptotically correct capacity functions $r(n,k,\epsilon)$ and
$\tilde{r}(n,k,\epsilon)$ is noteworthy. The relative capacities
$r'(k) := k\, r(n,k,\epsilon) / n^{j(i-1)/i} = (i!\,\epsilon)^{1/i}$ and
$\tilde{r}'(k) := k\, \tilde{r}(n,k,\epsilon) / n^{j(i-1)/i} = \frac{(k-j)!}{(k-1)!} (i!\,j!\,\epsilon)^{1/i}$
behave quite differently. Whereas the former is constant with respect to $k$, the latter
is falling in $k$. So, depending on whether a system is more sensitive to ordered tuple
overlaps or to unordered tuple overlaps, the relative capacity is constant or falling in
$k$. In particular for systems sensitive to unordered tuple overlaps it will be
beneficial to store many short sequences instead of a few long ones.
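The asymptotic capacity for ordered tuple overlaps, $r(n,k,\epsilon) = \frac{1}{k} (i!\,\epsilon)^{1/i}\, n^{j(i-1)/i}$, is simple enough to evaluate directly. The sketch below is an illustration with hypothetical function names; it covers only the ordered-tuple estimate and uses $i = 2$, $j = 2$ as defaults to match the first rule of thumb.

```python
from math import factorial


def asymptotic_capacity_ordered(n, k, eps, i=2, j=2):
    """Asymptotic number of storable patterns for ordered j-tuple overlaps."""
    return (factorial(i) * eps) ** (1.0 / i) * n ** (j * (i - 1) / i) / k


def relative_capacity_ordered(n, k, eps, i=2, j=2):
    """k * r / n^(j(i-1)/i); independent of k for the ordered-tuple estimate."""
    return k * asymptotic_capacity_ordered(n, k, eps, i, j) / n ** (j * (i - 1) / i)
```

For $n = 50$, $k = 8$ and $\epsilon = 1/2$ the function returns $50/8$, so about six sequences of length 8, and the relative capacity is the same for every sequence length $k$.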



III. RESULTS 

Synaptic plasticity transforms time sequences of excitation of neurons into directed
spatial patterns as intended. A simple example is shown in Fig. 6 for one input pattern.
For randomly chosen input sequences the patterns are structured in the same way but are
not as easy to detect with the human eye.

During training the synapses between consecutively active neurons are strengthened if
pointing in the direction of the activation order or weakened if connecting the neurons in
the wrong direction. An example of the development of the average synaptic strength of
synapses between neurons of one out of 5 trained sequences is shown in Fig. 7. Note that
the time course and final strength of the synapses depend on the speed with which the
sequences are entrained due to the non-constant learning curve (1).

FIG. 7: Development of synaptic strength during training. The network of 50 neurons was
trained with 5 sequences of length 8 in sequential order. The topmost panel shows the data
for sequences entrained with inter spike interval $\Delta t = 10$ ms, the middle with
$\Delta t = 15$ ms, and the lowest with $\Delta t = 20$ ms. Each sequence was presented
for $80\,\Delta t$ at a time. The data shown are average synaptic strengths of synapses
between the neurons of one of the trained sequences. The topmost points are the average
strengths of all synapses between the neurons and their direct successors in the sequence,
the middle are the corresponding strengths of synapses between neurons which are next
nearest neighbors in the sequence under consideration, and the lower points correspond to
strengths of synapses between neurons with distance 3 in the sequence. The lowest data
points are the strengths between the neurons of the sequence as described above but
against the order of activation in the trained sequence. The sharp rises in synaptic
strength correspond to training of the particular sequence shown here and the falling
flanks correspond to the decay while other patterns are trained.

The ability to store more than one pattern was tested in various setups. We mainly varied
the choice, number and length of input sequences and the speed of entrainment.

A typical example for a network of 50 neurons trained with 5 sequences of length 8 is
shown in Fig. 8 and Fig. 9. There are several important features to point out. First of
all the recall never comprises all 8 neurons of the trained sequence but only up to 7
active neurons. This is however not a universal feature but rather a characteristic of
the global inhibition circuit shutting down the system after ca. 7 spike occurrences, see
Fig. 8. Furthermore, note that the recall of the sequences speeds up toward the end of the
sequence. This is partly due to the fact that the integrate-and-fire neurons used here do
not have a finite rise time for their spikes, which allows them to instantaneously affect
their postsynaptic neurons.

FIG. 8: Typical recall episodes. The system of 50 neurons was trained with 2 (left panel)
or 5 (right panel) sequences for $1600\,\Delta t$ per sequence, where $\Delta t = 10$ ms.
It then receives a cue of two spikes from one of the trained sequences and autonomously
completes the sequence until stopped by the globally inhibitory neuron. Note that although
the recall of the identity and order of the neurons is perfect in both cases, the exact
timing is lost. In general one sees a tendency of speed-up toward the end of the recalled
sequence. This can have the effect of destroying the correct order of recall in the later
part of the sequence if the global inhibition is not present.

In a network with more realistic neurons one would expect that there is a lower limit on
the speed with which sequences can be recalled in the system. Preliminary studies with
realistic Hodgkin-Huxley type neurons show this effect [?]. It has clear advantages for
maintaining the correct order of recall in the system. The microscopic internal dynamics
of the neurons thus seems to be non-negligible for the macroscopic performance of the
system. This will be discussed in more detail in forthcoming work.

The quality of recall of sequences depends very much on the sequence and the piece
presented as a cue. This is no surprise, because sequence overlaps occur at certain
neurons in the sequence, and if these are used as a cue the performance is worse than when
other neurons are used. In Fig. 9 one can see how some sequences are reproduced very well
and error free while others lead to activation of quite a few incorrect neurons.

To test for the capacity of the system systematically 
we trained a network of 50 neurons with 2 up to 10 se- 
quences of length 8. For each number of sequences 5 inde- 
pendent sets of randomly chosen sequences were tested. 
Fig. shows the average response of the trained sys- 
tems to pieces of 2 inputs taken from the learned se- 
quences. The averages are over all possible subsequences 
and all 5 input sequence sets for each data point. This 
experiment was done with 3 different input speeds, i.e. 
the input was presented with fixed inter spike intervals 
of length At = 10 ms, 15 ms and 20 ms. As one can 

FIG. 9: Examples of learning in a 50 neuron network after
1600 $\Delta t$ sequential training with 5 input sequences of length
8. The left and the right panel show results for two
independently chosen sets of 5 input sequences labeled with
numbers 0 to 4 in each set. The filled symbols show the average
number of spiking neurons within a tested sequence and the open
symbols the erroneously spiking neurons. The test cues were
pieces of length 2 from the trained sequences. The circles were
obtained with a training speed of $\Delta t = 10$ ms, the squares
with $\Delta t = 15$ ms, and the triangles with $\Delta t = 20$ ms.
Note that the results depend on the structure of the input set.
Whereas in the left case all sequences have some overlap, in the
right case sequence 0 and sequence 3 are largely disjoint from
the others.



see in Fig. 10, the performance decreases dramatically for
the slowest entrainment speed. This is due to the fact
that the fixed width of the learning window in JQ) leads 
to weaker synapses in this case, as spikes are separated
further in time (see the last row of Fig. 7). The minimum
and maximum possible speeds of the entrainment are thus
directly determined by the learning window. If one chooses
a wider learning window, the slower sequences could be
entrained as well. However, this would also lead to
decreased performance for faster sequences.
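
The dependence on input speed can be illustrated with a minimal sketch. It assumes an exponential potentiation window of the form $A_+ e^{-\Delta t/\tau_+}$, a common textbook choice that is not necessarily the window used in this paper; the function name and the values of `a_plus` and `tau_plus_ms` are hypothetical.

```python
import math

def potentiation(delta_t_ms, a_plus=1.0, tau_plus_ms=10.0):
    """Weight increment for one pre-before-post spike pair separated by
    delta_t_ms, under an assumed exponential learning window."""
    return a_plus * math.exp(-delta_t_ms / tau_plus_ms)

# Increment per spike pair for the three inter-spike intervals used in the text.
increments = {dt: potentiation(dt) for dt in (10.0, 15.0, 20.0)}
for dt, dw in increments.items():
    print(f"inter-spike interval {dt:4.1f} ms -> increment {dw:.3f}")
```

With a fixed window width the increment per spike pair shrinks as the inter-spike interval grows, which is the qualitative effect described above. Widening the window (larger `tau_plus_ms`) raises the increment for slow sequences but also strengthens synapses between more distant neighbors in fast sequences.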

To test the dependence of learning success on the
length of the presented sequences we entrained a 50 neuron
system with sets of 5 sequences of length 6 to 9. Fig. 11
shows the performance of the system. At first sight it is
surprising that the system performs worse for shorter
sequences. Naively one would expect better performance
because overlaps are less likely. Indeed, the number of
erroneous spikes is smaller. On the other hand, the number
of correct spikes is also considerably smaller. This is
because the spikes preceding a given spike in a sequence
also succeed it, owing to the periodic presentation of the
sequences (see e.g. Fig. 3). Synapses between the
corresponding neurons are therefore depressed as well as
enhanced. For shorter sequences the last presentation of
the sequence is closer in time and the depression effect is
therefore stronger, leading to lower overall synaptic
strength, cf. Fig. 12. This causes the smaller number of
retrieved spikes for shorter sequences in Fig. 11. To some
extent this can be seen as an artifact because a longer
learning time or slightly larger learning increments $A_+$
could diminish this effect. On

FIG. 10: Scaling of storage quality with the number of input
sequences. A system with 50 neurons was trained with a
varying number of input sequences of length 8. The figure
shows the response after a total of 1600 $\Delta t$ training for each
input sequence. The filled symbols show the average number
of responding neurons within a tested sequence and the open
symbols the number of incorrectly responding neurons. The
test cues were pieces of 2 inputs from the trained sequences.
The circles were obtained with sequences trained with
inter-spike intervals $\Delta t = 10$ ms, the squares with
$\Delta t = 15$ ms, and the triangles with $\Delta t = 20$ ms. All
data points are averages of trials with 5 independently chosen
sets of input sequences.

FIG. 11: Scaling of storage quality with the length of input
sequences. A system with 50 neurons was trained with sets of 
5 input sequences of different lengths. The figure shows the 
response after a total of 16 s training for each input sequence. 
The filled symbols show the average number of responding 
neurons within a tested sequence and the open symbols the 
number of incorrectly responding neurons. The test cues were 
pieces of 2 inputs from the trained sequences. The circles were 
obtained with sequences of length 6, the squares with length 
7, the triangles with length 8, and the diamonds with length 
9. All data points are averages of trials with 5 independently 
chosen sets of input sequences. 

FIG. 12: Development of synaptic strength during training
of a sequence of length 6 with speed $\Delta t = 10$ ms. The
network of 50 neurons was trained with 5 sequences of length 6
in sequential order. Each sequence was presented for 80 $\Delta t$
at a time. The data shown are average synaptic strengths
of synapses between the neurons of one of the trained
sequences. The topmost points are the average strengths of all
synapses between the neurons and their direct successors in
the sequence, the middle points are the corresponding strengths
of synapses between neurons that are next-nearest neighbors in
the sequence under consideration, and the lower points
correspond to strengths of synapses between neurons with
distance 3 in the sequence. Note how the synaptic strength for
these synapses is suppressed because a spike that is the third
predecessor of a given spike is also the third successor of this
spike due to cyclic training. The lowest data points are the
strengths between the neurons of the sequence as described
above but against the order of activation in the trained
sequence.



the other hand this might have negative effects on the 
performance of the system in other parameter regions. 
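
The competition between enhancement and depression under cyclic presentation can be sketched as follows. This is an illustration under stated assumptions, not the model of this paper: the learning window is taken to be a symmetric pair of exponentials with equal amplitudes and time constants (all hypothetical values), and only the two nearest spike pairings per cycle are counted.

```python
import math

def net_change(distance, length, dt_ms=10.0, a_plus=1.0, a_minus=1.0, tau_ms=10.0):
    """Net weight change per cycle for the synapse from a neuron to the neuron
    `distance` steps later in a cyclically presented sequence of `length` spikes.

    The presynaptic spike precedes the postsynaptic one by distance * dt
    (potentiation) and follows it by (length - distance) * dt in the next
    cycle (depression).
    """
    pot = a_plus * math.exp(-distance * dt_ms / tau_ms)
    dep = a_minus * math.exp(-(length - distance) * dt_ms / tau_ms)
    return pot - dep

for length in (6, 7, 8, 9):
    row = ", ".join(f"d={d}: {net_change(d, length):+.3f}" for d in (1, 2, 3))
    print(f"length {length}: {row}")
```

For length 6 and distance 3 the two contributions cancel exactly in this symmetric toy window, mirroring the observation that a third predecessor is also a third successor. For longer sequences the depression term is smaller, so the net change at every distance is larger.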



IV. ROBUSTNESS 

Biological neural systems are subject to various exter- 
nal and internal noise sources. Starting from internal 
thermal noise within the system this ranges over noisy or 
unreliable input and influences from other parts of the 
organism up to external electromagnetic fields. To test 
the effect of noise on the learning success of our model 
systems we focused on two types of noise. We imple- 
mented a Gaussian white noise in the membrane poten- 
tial of the integrate-and-fire neurons and we implemented 
unreliable input. 

The internal white noise was added to the membrane
potential of each neuron independently. It is fully
characterized by its mean, 0 mV, and its variance, for which
several values between 0.2 mV and 1.0 mV were tested.

To simulate unreliable input we implemented Poisson
input neurons. These neurons produce rectangular spikes
of width $t_{\rm spike} = 3$ ms as before, but the time of
spiking is stochastic. The spike times are determined by the
Poisson distribution



$$P(n_{\rm spike} = k) = e^{-\lambda t}\,\frac{(\lambda t)^k}{k!} \qquad (20)$$



where $n_{\rm spike}$ is the number of spikes occurring in an
interval of length $t$ and the parameter $\lambda$ is the mean

FIG. 13: Impact of Gaussian white noise in the membrane
potential. The data points are the number of spiking neurons
within tested sequences after 2400 $\Delta t$ training at
$\Delta t = 10$ ms (full symbols) and the number of erroneously
spiking neurons (open symbols). The small symbols were obtained
when the noise was only present during learning and the large
ones when noise was always present. The circles correspond to a
cue of two inputs in testing and the squares to a cue of three
inputs.



firing rate. For small $t$ this can be approximated by
$P(n_{\rm spike} = 1) = \lambda t$, $P(n_{\rm spike} = 0) = 1 - \lambda t$ and
$P(n_{\rm spike} = k) = 0$ for $k > 1$. This is the probability
distribution we use to decide whether a neuron fires within
a time step of the Runge-Kutta algorithm used. After
firing, the neurons are refractory for $t_{\rm refract} = 10$ ms.
The training protocol is that the mean firing rate of the
first neuron is switched from 0 to some activity level
$\lambda_{\rm on}$ for $2\Delta t$, the next neuron is switched on
after $\Delta t$, also for $2\Delta t$, and so on. The reliability of
the input can be adjusted through the parameter $\lambda_{\rm on}$.
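
The stochastic input protocol described above can be sketched as follows. This is a minimal illustration, not the simulation code of this paper: the function names, the step size `dt_ms` (standing in for the integration time step, whose value is not given here) and the seed are assumptions, and only spike onset times are generated rather than rectangular spike shapes.

```python
import random

def poisson_input_spikes(rate_on_hz, t_start_ms, t_stop_ms, dt_ms=0.1,
                         t_refract_ms=10.0, rng=None):
    """Return spike onset times (ms) of one stochastic input neuron.

    In every step of length dt_ms inside the activity window the neuron fires
    with probability rate * dt (the small-interval limit of the Poisson
    distribution); after a spike it stays silent for t_refract_ms.
    """
    rng = rng or random.Random()
    p_fire = rate_on_hz * dt_ms / 1000.0   # lambda * t, with t in seconds
    spikes = []
    last_spike = -float("inf")
    n_steps = int(round((t_stop_ms - t_start_ms) / dt_ms))
    for step in range(n_steps):
        t = t_start_ms + step * dt_ms
        if t - last_spike < t_refract_ms:
            continue                       # still refractory
        if rng.random() < p_fire:
            spikes.append(t)
            last_spike = t
    return spikes

def training_cycle(n_inputs, rate_on_hz, delta_t_ms=10.0, seed=0):
    """Input i is active from i * delta_t for a window of 2 * delta_t."""
    rng = random.Random(seed)
    return [poisson_input_spikes(rate_on_hz, i * delta_t_ms,
                                 (i + 2) * delta_t_ms, rng=rng)
            for i in range(n_inputs)]

spike_trains = training_cycle(n_inputs=8, rate_on_hz=100.0)
for i, train in enumerate(spike_trains):
    print(i, [round(t, 1) for t in train])
```

Lower values of `rate_on_hz` leave more activity windows without any spike, which is how the reliability of the input is reduced in this sketch.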

Figs. 13 and 14 show the impact of the two types of
noise on the learning performance. Fig. 13 shows the
effect of additive white noise in the membrane potential,
applied either in the learning stage only or in both learning
and recall. As mentioned, the standard deviation of the noise
was chosen between 0.3 mV and 1.5 mV. The system seems to
be more or less unaffected by noise of this magnitude. As
expected, learning is even less sensitive to noise than
recall because the effect of the temporally uncorrelated
noise on the synaptic strength is averaged out over time.

Fig. 14 shows the learning success if the input neurons
fire stochastically, either during learning only or during
both learning and recall, as described above. The parameter
$\lambda_{\rm on}$ was varied from 60 to 160 Hz. The stochastic
firing of the input neurons seems to affect only the overall
number of spikes, i.e. correct spikes as well as incorrect
ones, but not their ratio. This indicates that mainly missing
input spikes during training, and especially during testing,
are responsible for the reduced number of spikes in the
response. It is to be expected that longer training can diminish

FIG. 14: Impact of noisy input on the learning performance.
The input sequences were provided by stochastic Poisson
neurons as described in the text. The data points are the
number of spiking neurons within tested sequences after
2400 $\Delta t$ training at $\Delta t = 10$ ms (full symbols) and the
number of erroneously spiking neurons (open symbols). The small
symbols were obtained when the stochasticity of the input was
only present during learning and the large ones when input was
always stochastic. The circles correspond to a cue of two inputs
in testing and the squares to a cue of three inputs.



these effects even more. As in the case of noise in the
membrane potential, the learning stage is not affected as
much by the noisy input as the recall. Again the same
argument applies: the effect of the stochasticity of the
input spikes is averaged out over time during the multiple
repetitions in the training phase.



V. DISCUSSION 

It has been demonstrated that STDP allows the
transformation of temporal information into spatial
information, providing an efficient mechanism for storing
temporal sequences which does not require a sophisticated
network topology. It is, however, not obvious how to
quantify the storage capacity of the system from the
observed recall performance for different numbers of stored
sequences. Taking as a heuristic criterion for successful
storage that on average one incorrect spike is allowed in
recall, the capacity of a system of 50 neurons is about 5 to 6
sequences (see Fig. 10). The capacity estimates for n = 50
and k = 8 are r(8,50, f) w 6.3 and f(8,50,±) ~ 2.6. 
The storage capacity of the system therefore seems to be
mainly limited by the statistical properties of the input,
i.e. the overlap probabilities for randomly chosen input
sequences. The biologically found STDP learning rule
evidently does not impose severe restrictions on the
ability to learn sequences; on the contrary, it seems to be
very well suited for this task. There are indications that
the learning mechanism is even more reliable with
biologically more realistic conductance-based model neurons,
whose non-trivial intrinsic dynamics to some extent prevent
the speedup in recall already discussed above.
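
The role of overlaps between randomly chosen sequences can be made concrete with a small sketch. It assumes that each sequence consists of `k` distinct neurons drawn uniformly at random from `n`, which may differ from how the input sets were actually generated; all function names are introduced here for illustration.

```python
import random
from math import comb

def prob_disjoint(n, k):
    """Probability that two random k-subsets of n neurons share no neuron."""
    return comb(n - k, k) / comb(n, k)

def expected_overlap(n, k):
    """Expected number of shared neurons (mean of the hypergeometric distribution)."""
    return k * k / n

def simulated_overlap(n, k, trials=20000, seed=1):
    """Monte Carlo estimate of the mean overlap of two random k-subsets."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = set(rng.sample(range(n), k))
        b = set(rng.sample(range(n), k))
        total += len(a & b)
    return total / trials

n, k = 50, 8
print("expected overlap:     ", expected_overlap(n, k))
print("probability disjoint: ", round(prob_disjoint(n, k), 3))
print("simulated overlap:    ", round(simulated_overlap(n, k), 3))
```

Under this assumption two sequences of length 8 in a 50 neuron system share 1.28 neurons on average and are fully disjoint only about one time in five, so with several stored sequences most of them overlap with at least one other.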

The successful storage of arbitrary input sequences,
however, crucially depends on the existence of the
corresponding synapses, making the all-to-all connections in
the investigated system a necessary requirement. In
realistic systems such global all-to-all connections cannot
be found, but this might be compensated through
divergence and redundancy of the input. If the density of
connections and the number of neurons each input
excites are high enough, pairs of connected neurons
excited by successive inputs will appear on a statistical
basis. This mechanism will be discussed more thoroughly
in forthcoming work.
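
A rough estimate of this statistical argument is sketched below. It is not the analysis announced for forthcoming work; it simply assumes that each possible synapse exists independently with probability `c`, that each input excites `m` neurons, and that the groups excited by successive inputs are disjoint. The function name and parameter values are hypothetical.

```python
def prob_connected_pair(c, m):
    """Probability that at least one of the m * m ordered pairs (neuron excited
    by one input, neuron excited by the following input) is connected, if each
    synapse exists independently with probability c."""
    return 1.0 - (1.0 - c) ** (m * m)

for c in (0.01, 0.05, 0.1):
    for m in (1, 5, 10):
        print(f"c={c:4.2f}, m={m:2d}: {prob_connected_pair(c, m):.3f}")
```

Because the number of candidate pairs grows as `m * m`, even a sparse connectivity yields a suitable synapse with high probability once each input excites a handful of neurons.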

The realistic implementation of saturation of synaptic 
strength for additive learning rules is another important 
topic. For the system investigated here we implemented 
a combination of two mechanisms. On the one hand the 
synaptic strength was directly bounded by use of the 
sigmoid filtering function applied to the bare synaptic 
strength subject to the additive learning rule, a tech- 
nique commonly used by biologists. On the other hand 
the steady decay of synaptic strength and the continu- 
ous stimulation of the network by the inputs lead to a 
dynamical steady state thereby bounding the synaptic 
strength dynamically. 
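
The two bounding mechanisms can be illustrated with a toy calculation. The functional forms and all parameter values below are assumptions made for illustration only; the sigmoid and the decay law actually used are those defined in the model description of this paper.

```python
import math

def effective_strength(g_raw, g_max=1.0, g_mid=0.5, slope=0.1):
    """Assumed sigmoid filter: maps the unbounded bare strength to a value
    bounded by 0 and g_max."""
    return 0.5 * g_max * (1.0 + math.tanh((g_raw - g_mid) / slope))

def train(increment, period_ms, tau_decay_ms, n_cycles):
    """Bare strength after n_cycles of exponential decay over one period
    followed by one additive learning increment."""
    g_raw = 0.0
    decay = math.exp(-period_ms / tau_decay_ms)
    for _ in range(n_cycles):
        g_raw = g_raw * decay + increment
    return g_raw

increment, period_ms, tau_ms = 0.05, 80.0, 1000.0
# Fixed point of g -> g * decay + increment.
steady_state = increment / (1.0 - math.exp(-period_ms / tau_ms))
g_raw = train(increment, period_ms, tau_ms, n_cycles=500)
print(g_raw, steady_state, effective_strength(g_raw))
```

In this sketch the bare strength converges to the fixed point where the decay per cycle balances the increment, and the sigmoid keeps the effective strength bounded regardless of the bare value. Without continued stimulation the bare strength decays on the time scale `tau_decay_ms`.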

Whereas the direct bound through a sigmoid filtering 
function might capture some aspects of the behavior of 
real synapses, the decay of synaptic strength necessary to 
achieve a realistic dynamical steady state is clearly too 
fast to be realistic. The system forgets much too fast if 
it is not continuously stimulated with appropriate input. 

Alternative solutions to the saturation problem include 
competition based mechanisms suggested by recent find- 
ings of interactions of various kinds between neighboring 
synapses on a dendritic tree [Hcf and learning rules which 
depend on the synaptic strength itself, e.g.
multiplicative learning rules.

The system is reasonably robust against noise. It is 
noteworthy that it is not very sensitive to internal high- 
frequency noise. In the range of noise applied in our trials 
the recall barely depended on the level of noise (see Fig. 
IT51 Whether this is an effect of the integrate-and-fire 
neuron model used here is beyond the scope of this work. 
The tolerance to biologically more relevant noise in the
spike timing of the input is also rather impressive,
taking into account that $\lambda_{\rm on} = 60$ Hz corresponds to a
total firing probability of only 36% for each of the input
neurons within their activity window of 20 ms. Nevertheless,
the system was still able to store at least parts of the
presented sequences at this high noise level.
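
As a rough check of the probability quoted above, the Poisson distribution of Eq. (20) can be evaluated for an input rate of 60 Hz and an activity window of 20 ms. This sketch ignores the refractory period and the discrete time stepping of the simulation, so it only approximates the input neurons of the model; the helper `poisson_pmf` is introduced here for illustration.

```python
import math

def poisson_pmf(k, rate_hz, window_ms):
    """P(n_spike = k) for a Poisson process with the given rate in a window."""
    mean = rate_hz * window_ms / 1000.0   # lambda * t
    return math.exp(-mean) * mean ** k / math.factorial(k)

p_exactly_one = poisson_pmf(1, 60.0, 20.0)
p_at_least_one = 1.0 - poisson_pmf(0, 60.0, 20.0)
print(f"P(exactly one spike)  = {p_exactly_one:.3f}")
print(f"P(at least one spike) = {p_at_least_one:.3f}")
```

Under these assumptions the probability of exactly one spike in the window is about 0.36, which agrees with the quoted value, while the probability of at least one spike is about 0.70.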

APPENDIX A: TAYLOR EXPANSION OF $p_j^i$

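We first need to prove the identity

$$\frac{d^n}{dx^n}\, x^s (1-x)^{r-s} = \sum_{k=\max\{n+s-r,0\}}^{\min\{s,n\}} \frac{s!}{(s-k)!}\, \frac{(r-s)!}{(r-s-(n-k))!} \binom{n}{k} (-1)^{n-k}\, x^{s-k} (1-x)^{r-s-(n-k)}. \qquad (A1)$$

The proof is by induction. Let $n = 0$. Then the equation reduces to

$$x^s (1-x)^{r-s} = \frac{s!}{s!}\, \frac{(r-s)!}{(r-s)!} \binom{0}{0} (-1)^0\, x^s (1-x)^{r-s}, \qquad (A2)$$

which is clearly true. Assuming the validity of (A1) for $n$ we can calculate

$$\frac{d^{n+1}}{dx^{n+1}}\, x^s (1-x)^{r-s} \qquad (A3)$$

$$= \frac{d}{dx} \sum_{k=\max\{n+s-r,0\}}^{\min\{s,n\}} \frac{s!}{(s-k)!}\, \frac{(r-s)!}{(r-s-(n-k))!} \binom{n}{k} (-1)^{n-k}\, x^{s-k} (1-x)^{r-s-(n-k)} \qquad (A4)$$

$$= \sum_{k=\max\{n+s-r,0\}}^{\min\{s,n\}} \frac{s!}{(s-k)!}\, \frac{(r-s)!}{(r-s-(n-k))!} \binom{n}{k} (-1)^{n-k}\, (s-k)\, x^{s-k-1} (1-x)^{r-s-(n-k)} \qquad (A5)$$

$$+ \sum_{k=\max\{n+s-r,0\}}^{\min\{s,n\}} \frac{s!}{(s-k)!}\, \frac{(r-s)!}{(r-s-(n-k))!} \binom{n}{k} (r-s-(n-k))\, (-1)^{n+1-k}\, x^{s-k} (1-x)^{r-s-(n-k)-1}. \qquad (A6)$$

Shifting the index in the first sum by one, using the well
known identity $\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}$ and obvious
identities like $1 = \binom{n+1}{0}$, one obtains equation (A1) for
$n + 1$, which completes the proof.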
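
The identity can also be checked numerically with exact rational arithmetic. The sketch below reads (A1) as the Leibniz product rule applied to $x^s (1-x)^{r-s}$, i.e. the $n$-th derivative equals the sum over $\max\{n+s-r,0\} \le k \le \min\{s,n\}$ of $\binom{n}{k} \frac{s!}{(s-k)!} \frac{(r-s)!}{(r-s-(n-k))!} (-1)^{n-k} x^{s-k} (1-x)^{r-s-(n-k)}$. That reading, the helper names and the test ranges are assumptions of this example; a numerical check does not replace the induction proof.

```python
from fractions import Fraction
from math import comb, factorial

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_deriv(a):
    """Differentiate a coefficient list once."""
    return [i * a[i] for i in range(1, len(a))] or [0]

def poly_eval(a, x):
    """Evaluate a coefficient list at x."""
    return sum(c * x ** i for i, c in enumerate(a))

def lhs(n, s, r, x):
    """n-th derivative of x^s (1-x)^(r-s), computed by explicit polynomial algebra."""
    poly = [0] * s + [1]                  # x^s
    for _ in range(r - s):
        poly = poly_mul(poly, [1, -1])    # times (1 - x)
    for _ in range(n):
        poly = poly_deriv(poly)
    return poly_eval(poly, x)

def rhs(n, s, r, x):
    """Right-hand side of the identity as read from (A1)."""
    total = Fraction(0)
    for k in range(max(n + s - r, 0), min(s, n) + 1):
        m = r - s - (n - k)               # remaining power of (1 - x)
        total += (comb(n, k)
                  * Fraction(factorial(s), factorial(s - k))
                  * Fraction(factorial(r - s), factorial(m))
                  * (-1) ** (n - k) * x ** (s - k) * (1 - x) ** m)
    return total

x = Fraction(1, 3)
checked = 0
for r in range(0, 7):
    for s in range(0, r + 1):
        for n in range(0, r + 3):
            assert lhs(n, s, r, x) == rhs(n, s, r, x)
            checked += 1
print(f"identity holds for {checked} (r, s, n) combinations at x = {x}")
```

The loop also covers $n > r$, where the sum over $k$ is empty and the derivative of the degree-$r$ polynomial vanishes.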

The Taylor expansion for $p_j^i$ is then straightforward:

$$p_j^i = 1 - \sum_{s=0}^{i-1} \binom{r}{s} (p_j)^s (1-p_j)^{r-s} \qquad (A7)$$

$$= -\sum_{n=1}^{\infty} \sum_{s=0}^{i-1} \binom{r}{s} \left[ \sum_{k=\max\{n+s-r,0\}}^{\min\{s,n\}} \frac{s!}{(s-k)!}\, \frac{(r-s)!}{(r-s-(n-k))!} \binom{n}{k} (-1)^{n-k} (p_j)^{s-k} (1-p_j)^{r-s-(n-k)} \right]_{p_j=0} \frac{(p_j)^n}{n!}. \qquad (A8)$$

For all $k < s$ the $n$-th derivative contains a non-zero
power of $p_j$ and is thus 0 at $p_j = 0$. Furthermore, if
$s > n$ then all $k$ are less than $s$ and therefore the whole
sum over $k$ is empty. We end up with

$$p_j^i = -\sum_{n=1}^{\infty} \sum_{s=0}^{\min\{i-1,n\}} \binom{r}{s} \frac{s!\,(r-s)!}{(r-n)!} \binom{n}{s} (-1)^{n-s}\, \frac{(p_j)^n}{n!} \qquad (A9)$$

$$= -\sum_{n=1}^{\infty} \sum_{s=0}^{\min\{i-1,n\}} \binom{r}{n} \binom{n}{s} (-1)^{n-s} (p_j)^n. \qquad (A10)$$

Therefore, the leading term of the Taylor expansion of $p_j^i$
is

$$p_j^i = \binom{r}{i} (p_j)^i + O\big((p_j)^{i+1}\big). \qquad (A12)$$

For any $n < i - 1$ the inner sum is



s=0 